Lyrics segmentation via bimodal text–audio representation

Abstract

Song lyrics contain repeated patterns that have been proven to facilitate automated lyrics segmentation, with the final goal of detecting the building blocks (e.g., chorus, verse) of a song text. Our contribution in this article is twofold. First, we introduce a convolutional neural network (CNN)-based model that learns to segment the lyrics based on their repetitive text structure. We experiment with novel features to reveal different kinds of repetitions in the lyrics, for instance based on phonetical and syntactical properties. Second, using a corpus where the song text is synchronized to the audio of the song, we show that the text and audio modalities capture complementary structure of the lyrics and that combining both is beneficial for lyrics segmentation performance. For purely text-based lyrics segmentation on a dataset of 103k lyrics, we achieve an F-score of 67.4%, improving on the state of the art (59.2% F-score). On the synchronized text–audio dataset of 4.8k songs, we show that the additional audio features improve segmentation performance to 75.3% F-score, significantly outperforming purely text-based approaches.
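
The abstract does not spell out how repetition is represented. One common way to expose repeated lines is a line-by-line self-similarity matrix, which a model such as a CNN could then take as input. The following is a minimal sketch under that assumption: it uses character-level string similarity from Python's standard `difflib` module, and the function names and toy lyrics are hypothetical, not the authors' implementation or features.

```python
from difflib import SequenceMatcher


def line_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] between two lyric lines."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def self_similarity_matrix(lines):
    """Build an n x n matrix whose entry (i, j) compares line i with line j.

    Only the upper triangle is computed and then mirrored, so the
    matrix is symmetric by construction.
    """
    n = len(lines)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            score = line_similarity(lines[i], lines[j])
            matrix[i][j] = score
            matrix[j][i] = score
    return matrix


# Toy lyrics (hypothetical): lines 0-1 repeat as lines 3-4.
lyrics = [
    "la la la",
    "sing it again",
    "walking down the road",
    "la la la",
    "sing it again",
]
ssm = self_similarity_matrix(lyrics)
```

In such a matrix, a repeated block of lines shows up as a run of high values parallel to the main diagonal, which is the kind of pattern a segmentation model could learn to associate with segment boundaries.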

Similar resources

Artist Attribution via Song Lyrics

Song lyrics, separated from the audio signal of their song, still contain a significant amount of information. Mood and meaning can still be conveyed effectively by a pure textual representation. There has even been somewhat successful previous work on genre classification from song lyrics[7]. Building on previous work, we seek to build an artist attribution system for song lyrics. This task is...

Robust Segmentation via Sparse Shape Representation

Deformable Segmentation via Sparse Shape Representation

Appearance and shape are two key elements exploited in medical image segmentation. However, in some medical image analysis tasks, appearance cues are weak/misleading due to disease/artifacts and often lead to erroneous segmentation. In this paper, a novel deformable model is proposed for robust segmentation in the presence of weak/misleading appearance cues. Owing to the less trustable appearan...

Segmentation-Based Lyrics-Audio Alignment using Dynamic Programming

In this paper, we present a system for automatic alignment of textual lyrics with musical audio. Given an input audio signal, structural segmentation is first performed and similar segments are assigned a label by computing the distance between the segment pairs. Using the results of segmentation and hand-labeled paragraphs in lyrics as a pair of input strings, we apply a dynamic programming (D...

A New IRIS Segmentation Method Based on Sparse Representation

Iris recognition is one of the most reliable methods for identification. In general, it consists of image acquisition, iris segmentation, feature extraction, and matching. Among them, iris segmentation plays an important role in the performance of any iris recognition system. Nonlinear eye movement, occlusion, and specular reflection are the main challenges for any iris segmentation method. In thi...

Journal

Journal title: Natural Language Engineering

Year: 2021

ISSN: 1469-8110, 1351-3249

DOI: https://doi.org/10.1017/s1351324921000024